132 research outputs found

    Revisiting Numerical Pattern Mining with Formal Concept Analysis

    Get PDF
    In this paper, we investigate the problem of mining numerical data in the framework of Formal Concept Analysis. The usual way is to use a scaling procedure --transforming numerical attributes into binary ones-- leading either to a loss of information or of efficiency, in particular w.r.t. the volume of extracted patterns. By contrast, we propose to directly work on numerical data in a more precise and efficient way, and we prove it. For that, the notions of closed patterns, generators and equivalent classes are revisited in the numerical context. Moreover, two original algorithms are proposed and used in an evaluation involving real-world data, showing the predominance of the present approach

    Mining Biclusters of Similar Values with Triadic Concept Analysis

    Get PDF
    Biclustering numerical data became a popular data-mining task in the beginning of 2000's, especially for analysing gene expression data. A bicluster reflects a strong association between a subset of objects and a subset of attributes in a numerical object/attribute data-table. So called biclusters of similar values can be thought as maximal sub-tables with close values. Only few methods address a complete, correct and non redundant enumeration of such patterns, which is a well-known intractable problem, while no formal framework exists. In this paper, we introduce important links between biclustering and formal concept analysis. More specifically, we originally show that Triadic Concept Analysis (TCA), provides a nice mathematical framework for biclustering. Interestingly, existing algorithms of TCA, that usually apply on binary data, can be used (directly or with slight modifications) after a preprocessing step for extracting maximal biclusters of similar values.Comment: Concept Lattices and their Applications (CLA) (2011

    On mining complex sequential data by means of FCA and pattern structures

    Get PDF
    Nowadays data sets are available in very complex and heterogeneous ways. Mining of such data collections is essential to support many real-world applications ranging from healthcare to marketing. In this work, we focus on the analysis of "complex" sequential data by means of interesting sequential patterns. We approach the problem using the elegant mathematical framework of Formal Concept Analysis (FCA) and its extension based on "pattern structures". Pattern structures are used for mining complex data (such as sequences or graphs) and are based on a subsumption operation, which in our case is defined with respect to the partial order on sequences. We show how pattern structures along with projections (i.e., a data reduction of sequential structures), are able to enumerate more meaningful patterns and increase the computing efficiency of the approach. Finally, we show the applicability of the presented method for discovering and analyzing interesting patient patterns from a French healthcare data set on cancer. The quantitative and qualitative results (with annotations and analysis from a physician) are reported in this use case which is the main motivation for this work. Keywords: data mining; formal concept analysis; pattern structures; projections; sequences; sequential data.Comment: An accepted publication in International Journal of General Systems. The paper is created in the wake of the conference on Concept Lattice and their Applications (CLA'2013). 27 pages, 9 figures, 3 table
    corecore